Greedy Feature Selection for Subspace Clustering

Authors

  • Eva L. Dyer
  • Aswin C. Sankaranarayanan
  • Richard G. Baraniuk
Abstract

Unions of subspaces provide a powerful generalization of single subspace models for collections of high-dimensional data; however, learning multiple subspaces from data is challenging because segmentation (the identification of points that live in the same subspace) and subspace estimation must be performed simultaneously. Recently, sparse recovery methods were shown to provide a provable and robust strategy for exact feature selection (EFS), that is, recovering subsets of points from the ensemble that live in the same subspace. In parallel with recent studies of EFS with ℓ1-minimization, in this paper we develop sufficient conditions for EFS with a greedy method for sparse signal recovery known as orthogonal matching pursuit (OMP). Following our analysis, we provide an empirical study of feature selection strategies for signals living on unions of subspaces and characterize the gap between sparse recovery methods and nearest neighbor (NN)-based approaches. In particular, we demonstrate that sparse recovery methods provide significant advantages over NN methods and that the gap between the two approaches is particularly pronounced when the sampling of subspaces in the dataset is sparse. Our results suggest that OMP may be employed to reliably recover exact feature sets in a number of regimes where NN approaches fail to reveal the subspace membership of points in the ensemble.
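To make the idea of OMP-based feature selection concrete, here is a minimal sketch, not the authors' implementation. It assumes the data points are unit-norm columns of a matrix, and that the feature set of a point is the set of other points OMP picks when approximating it. The function name `omp_feature_set`, the parameters `k` and `tol`, and the toy data (two orthogonal 2-dimensional subspaces of R^4) are all hypothetical choices made for illustration.

```python
import numpy as np


def omp_feature_set(Y, i, k, tol=1e-10):
    """Greedily select up to k columns of Y (never column i) that approximate Y[:, i].

    Illustrative sketch of OMP used for feature selection; the name and
    parameters are assumptions, not taken from the paper.
    """
    n = Y.shape[1]
    y = Y[:, i]
    residual = y.copy()
    available = np.ones(n, dtype=bool)
    available[i] = False  # a point may not represent itself
    selected = []
    for _ in range(k):
        # Score each remaining point by its correlation with the residual.
        scores = np.where(available, np.abs(Y.T @ residual), -np.inf)
        j = int(np.argmax(scores))
        if scores[j] <= tol:  # nothing left that correlates with the residual
            break
        selected.append(j)
        available[j] = False
        # Least-squares fit of y on the selected points, then update the residual.
        coef, *_ = np.linalg.lstsq(Y[:, selected], y, rcond=None)
        residual = y - Y[:, selected] @ coef
        if np.linalg.norm(residual) <= tol:
            break
    return selected


# Toy ensemble (assumed for illustration): five points on each of two
# orthogonal 2-dimensional subspaces of R^4, normalized to unit length.
rng = np.random.default_rng(0)
A = np.zeros((4, 5))
A[:2] = rng.standard_normal((2, 5))
B = np.zeros((4, 5))
B[2:] = rng.standard_normal((2, 5))
Y = np.hstack([A, B])
Y /= np.linalg.norm(Y, axis=0)
labels = np.array([0] * 5 + [1] * 5)

feature_sets = [omp_feature_set(Y, i, k=2) for i in range(Y.shape[1])]
# EFS holds for point i when every selected point shares its subspace label.
efs = [all(labels[j] == labels[i] for j in S) for i, S in enumerate(feature_sets)]
```

In this toy setting the two subspaces are orthogonal, so points from the other subspace have zero correlation with the residual and are never selected. The regimes the abstract is concerned with, where subspaces overlap or are sparsely sampled, are harder and are not captured by this example.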


Similar resources

Functional Subspace Clustering with Application to Time Series

Functional data, where samples are random functions, are increasingly common and important in a variety of applications, such as health care and traffic analysis. They are naturally high dimensional and lie along complex manifolds. These properties warrant use of the subspace assumption, but most state-of-the-art subspace learning algorithms are limited to linear or other simple settings. To ad...

Optimal Feature Selection for Data Classification and Clustering: Techniques and Guidelines

In this paper, principles and existing feature selection methods for classifying and clustering data are introduced. To that end, categorizing frameworks for finding selected subsets, namely, search-based and non-search based procedures as well as evaluation criteria and data mining tasks are discussed. In the following, a platform is developed as an intermediate step toward developing an intell...

Feature Selection based Semi-Supervised Subspace Clustering

Clustering is the process used to assign a set of n objects to clusters (groups). Dimensionality reduction techniques help increase the accuracy of clustering results by removing redundant and irrelevant dimensions. However, in most situations, objects can be related in different ways in different subsets of the dimensions. Dimensionality reduction tends to get rid of such rel...

Publication date: 2012